A TWO-STAGE MINIMAX PROCEDURE WITH SCREENING
FOR SELECTING THE LARGEST NORMAL MEAN (II):

AN IMPROVED PCS LOWER BOUND AND ASSOCIATED TABLES


Ajit C. Tamhane

Northwestern University, Evanston, Illinois
Robert E. Bechhofer
Cornell University, Ithaca, New York


ABSTRACT 

This paper is a follow-up to an earlier article by the
authors in which they proposed a two-stage procedure with
screening to select the normal population with the largest
population mean when the populations have a common known variance.
The two-stage procedure has the highly desirable property that
the expected total number of observations required by the
procedure is always less than the total number of observations
required by the corresponding single-stage procedure of Bechhofer
(1954), regardless of the configuration of the population means.
The present paper contains new results which make possible the
more efficient implementation of the two-stage procedure. Tables
for this purpose are given, and the improvements achieved (which
are substantial) are assessed.


1. INTRODUCTION AND SUMMARY


The present paper is a follow-up to Tamhane and Bechhofer
(1977) (henceforth referred to as T-B) and contains some new
results which make possible the more efficient implementation of
the two-stage procedure proposed in Section 4 of T-B. In order
to make the present paper somewhat self-contained, certain results
from T-B are repeated here (without proof); the reader is referred
to T-B for background and motivation as well as for the necessary
proofs.

In T-B we studied in depth a two-stage procedure ($\mathcal{P}_2$) for
selecting the largest normal mean. This procedure (which employs
the indifference-zone approach of Bechhofer (1954)) screens out
"noncontending" populations in the first stage and selects the
"best" population from among the "contending" populations which
enter the second stage. In order to determine the constants
necessary to implement $\mathcal{P}_2$, we proposed in T-B the criterion of
minimizing the maximum (over the entire parameter space) of the
expected total sample size required by $\mathcal{P}_2$ subject to the
procedure's guaranteeing a specified probability requirement. As
a consequence, $\mathcal{P}_2$ based on this unrestricted minimax (U-minimax)
design criterion possesses the highly desirable property that the
expected total sample size required by $\mathcal{P}_2$ is always less than
the total sample size required by the best competing single-stage
procedure ($\mathcal{P}_1$) of Bechhofer (1954), regardless of the true
configuration of the population means.

As noted in Section 10 of T-B, there were two main unsolved
problems associated with $\mathcal{P}_2$ applied to three or more ($k \ge 3$)
populations. First, the so-called least-favorable configuration
(LFC) of the population means has not yet been determined for
$k \ge 3$; knowledge of this configuration is required in order to
determine the best set of constants necessary to implement $\mathcal{P}_2$.
Second, even if the LFC of the population means were known for
$k \ge 3$, the problem of evaluating the probability of a correct
selection ($P\{CS\}$) associated with $\mathcal{P}_2$ (see (5.1) of T-B) when
the population means are in that configuration would still remain;
it is extremely difficult and costly to evaluate the exact $P\{CS\}$
associated with $\mathcal{P}_2$ on a computer, even if the population means
are in the so-called "slippage" configuration (see (5.6) of T-B).

It is possible to determine a set of constants (although
not the best set) to implement $\mathcal{P}_2$ if a lower bound to the
$P\{CS\}$ of $\mathcal{P}_2$ can be found and the LFC of the population means
determined for that lower bound; such a set of constants
provides a conservative solution to the problem. It was this
device that we adopted in T-B in order to circumvent the first
unsolved problem; it turned out that it was straightforward to
determine the LFC of the population means for the lower
bound that we adopted (see (5.8) of T-B), and in addition the
integrals associated with that lower bound proved to be very easy
to compute.

A referee of T-B proposed a new lower bound to the $P\{CS\}$
of $\mathcal{P}_2$ (see Section 11 of T-B), his bound being uniformly superior
to ours; it is also straightforward to determine the LFC of the
population means relative to the referee's lower bound, and the
resulting integrals are easy to evaluate. Using this new lower
bound we could have computed a new set of constants to implement
$\mathcal{P}_2$, and thereby obtained a less conservative solution to our problem.

In the present paper we obtain a third lower bound to the
$P\{CS\}$ of $\mathcal{P}_2$, one which is uniformly superior to the referee's,
and relative to which the LFC of the population means is easily
obtained. However, it is quite a bit more difficult and costly
(although not prohibitively so) to evaluate the resulting function
than was the case for our original bound or for the referee's.
It turned out that such computations were justified since results
obtained with this newest bound yield a significant improvement
over all of our previous results. We make these ideas precise
in the next sections. The comparisons between our new results
and our previous ones are made in Section 5. In Section 6 a
numerical example is given which illustrates the options (in terms
of choice among procedures) available to the experimenter, and
the strikingly different consequences associated with each option.


2. PRELIMINARIES


2.1  Assumptions 

Let $\Pi_i$ $(1 \le i \le k)$ denote a normal population with unknown
mean $\mu_i$ and known variance $\sigma^2$, and let $\Omega$ denote the
parameter space of vectors $\mu = (\mu_1,\ldots,\mu_k)$. Denote the ranked values
of the $\mu_i$ by $\mu_{[1]} \le \cdots \le \mu_{[k]}$, and let
$\delta_{i,j} = \mu_{[i]} - \mu_{[j]}$.
We assume no prior knowledge concerning the pairing of the
$\Pi_i$ with the $\mu_{[j]}$ $(1 \le i,j \le k)$. Any one of the populations
(if there is more than one) with $\mu$-value equal to $\mu_{[k]}$ is
regarded as "best."


2.2  Goal  and  Probability  Requirement 

The goal of the experimenter is to select a best population.
This event is referred to as a correct selection (CS). The
experimenter restricts consideration to procedures ($\mathcal{P}$) which
guarantee the probability requirement

$$P_\mu\{CS \mid \mathcal{P}\} \ge P^* \quad \text{for all } \mu \in \Omega(\delta^*) \tag{2.1}$$

where $\{\delta^*, P^*\}$ with $0 < \delta^* < \infty$, $1/k < P^* < 1$ are specified prior to
experimentation, and $\Omega(\delta^*) = \{\mu \in \Omega \mid \delta_{k,k-1} \ge \delta^*\}$.

2.3 Two-stage Procedure ($\mathcal{P}_2$)

In T-B we proposed a two-stage procedure $\mathcal{P}_2 = \mathcal{P}_2(n_1,n_2,h)$
(previously considered by Alam (1970)) which depends on
nonnegative integers $n_1, n_2$ and a real constant $h \ge 0$. The
constants $(n_1,n_2,h)$ depend on $k$ and $\{\delta^*,P^*\}$, and are
chosen so that $\mathcal{P}_2$ guarantees (2.1) and possesses a certain
minimax property (given by (2.2)).

Procedure $\mathcal{P}_2$

Stage 1: Take a random sample of size $n_1$ from each $\Pi_i$ and
compute the sample mean $\bar X_i^{(1)}$ $(1 \le i \le k)$. Let
$\bar X_{[k]}^{(1)} = \max_{1 \le i \le k} \bar X_i^{(1)}$. Determine the subset $I$ of $\{1,2,\ldots,k\}$ where
$I = \{i \mid \bar X_i^{(1)} \ge \bar X_{[k]}^{(1)} - h\}$. The populations $\Pi_i$ with
subscripts in $I$ are the ones which enter the second stage (if $I$
has more than one element). Let $\Pi_I$ denote the set of populations
with subscripts in $I$.

a) If $\Pi_I$ contains only one population, stop sampling and
assert that the population associated with $\bar X_{[k]}^{(1)}$ is best.

b) If $\Pi_I$ contains more than one population, proceed to
the second stage.

Stage 2: Take a random sample of size $n_2$ from each $\Pi_i$ with
$i \in I$ and compute the sample mean $\bar X_i^{(2)}$. Compute the cumulative
sample mean $\bar X_i = (n_1 \bar X_i^{(1)} + n_2 \bar X_i^{(2)})/(n_1+n_2)$ for each $\Pi_i$ with
$i \in I$. Assert that the population associated with $\bar X_{[k]} = \max_{i \in I} \bar X_i$
is best.
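The two stages above can be sketched as a small simulation. This is only an illustrative sketch, not the authors' program: the function name `two_stage_select`, its argument order, and the use of Python's `random.Random.gauss` to draw each sample mean directly from its normal distribution are all assumptions made here for brevity.

```python
import random

def two_stage_select(mu, sigma, n1, n2, h, rng):
    """Simulate one run of the two-stage procedure.

    Returns (index of the population asserted best, total sample size).
    Each sample mean is drawn directly from N(mu_i, sigma^2 / n).
    """
    k = len(mu)
    # Stage 1: first-stage sample means and the screening subset I.
    xbar1 = [rng.gauss(m, sigma / n1 ** 0.5) for m in mu]
    best1 = max(xbar1)
    survivors = [i for i in range(k) if xbar1[i] >= best1 - h]
    total = k * n1
    if len(survivors) == 1 or n2 == 0:
        # a) Only one contender (or no second-stage sample): stop.
        return xbar1.index(best1), total
    # b) Stage 2: sample only the survivors and pool both stages.
    pooled = {}
    for i in survivors:
        xbar2 = rng.gauss(mu[i], sigma / n2 ** 0.5)
        pooled[i] = (n1 * xbar1[i] + n2 * xbar2) / (n1 + n2)
    total += n2 * len(survivors)
    return max(pooled, key=pooled.get), total
```

Averaging the returned total over many runs gives a rough estimate of the expected total sample size for a chosen configuration of the means; averaging the indicator that the returned index is the best population estimates the probability of a correct selection.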


Remark 2.1: A two-stage procedure of Somerville (1974) which is
related to ours has recently come to our attention. His procedure
eliminates a predetermined number of populations at the end of
the first stage whereas ours eliminates a random number; thus for
favorable configurations of the population means, his procedure
always requires two stages and a fixed total number of
observations, whereas ours often requires only a single stage or two
stages with a small total number of observations. Moreover, after
the first-stage data are used to determine which populations enter
the second stage, Somerville's procedure ignores the information
concerning the population means obtained in the first stage. Our
procedure not only uses the first-stage data to determine which
populations enter the second stage, but also for those populations
which do enter the second stage our procedure pools the first-stage
and second-stage information concerning the associated population
means; thus our procedure makes fuller use of the information in
the total experiment.

Let $T$ denote the total sample size and $E_\mu\{T \mid \mathcal{P}_2\}$ denote
the expected total sample size required by $\mathcal{P}_2$. In T-B (Section
4.2) we proposed the following unrestricted minimax (U-minimax)
design criterion to determine $(n_1,n_2,h)$ guaranteeing (2.1).

2.4  U-minimax  Design  Criterion 

For given $k$ and specified $\{\delta^*, P^*\}$ choose $(n_1,n_2,h)$ to

$$\text{minimize } \sup_{\mu \in \Omega} E_\mu\{T \mid \mathcal{P}_2\} \quad \text{subject to } \inf_{\mu \in \Omega(\delta^*)} P_\mu\{CS \mid \mathcal{P}_2\} \ge P^*, \tag{2.2}$$

where $n_1, n_2$ are nonnegative integers and $h \ge 0$. We denote by
$(\hat n_1,\hat n_2,\hat h \mid E)$ the exact solution to (2.2), and by $\mathcal{P}_2(E)$ the
procedure using that solution. If a lower bound on $P_\mu\{CS \mid \mathcal{P}_2\}$
is used in the l.h.s. of the constraint of (2.2), then we denote
by $(\hat n_1,\hat n_2,\hat h \mid C)$ the corresponding conservative solution to (2.2),
and by $\mathcal{P}_2(C)$ the procedure using that solution.

3. AN IMPROVED LOWER BOUND ON $P_\mu\{CS \mid \mathcal{P}_2\}$,
AND THE ASSOCIATED U-MINIMAX OPTIMIZATION PROBLEM

3.1 Improved Lower Bound on $P_\mu\{CS \mid \mathcal{P}_2\}$

Our new lower bound is given in the following theorem which
is proved below.


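Theorem 3.1: For all $\mu \in \Omega$ we have the following inequalities:

$$P_\mu\{CS \mid \mathcal{P}_2\} \ge A \ge B \ge C$$

where

$$A = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \prod_{i=1}^{k-1} \Phi_2\left[\frac{(\delta_{k,i}+h)\sqrt{n_1}}{\sigma} - x_1,\ \frac{\delta_{k,i}\sqrt{m}}{\sigma} - x_2 \,\Big|\, \sqrt{p}\right] d\Phi_2[x_1,x_2 \mid \sqrt{p}], \tag{3.1a}$$

$$B = \left\{\int_{-\infty}^{\infty} \prod_{i=1}^{k-1} \Phi[x + (\delta_{k,i}+h)\sqrt{n_1}/\sigma]\,d\Phi(x)\right\} \times \left\{\int_{-\infty}^{\infty} \prod_{i=1}^{k-1} \Phi[x + \delta_{k,i}\sqrt{m}/\sigma]\,d\Phi(x)\right\}, \tag{3.1b}$$

$$C = \int_{-\infty}^{\infty} \prod_{i=1}^{k-1} \Phi[x + (\delta_{k,i}+h)\sqrt{n_1}/\sigma]\,d\Phi(x) + \int_{-\infty}^{\infty} \prod_{i=1}^{k-1} \Phi[x + \delta_{k,i}\sqrt{m}/\sigma]\,d\Phi(x) - 1. \tag{3.1c}$$

Here $\Phi_2(\cdot,\cdot \mid \rho)$ denotes the standard bivariate normal cdf with
correlation coefficient $\rho$ $(-1 < \rho < 1)$; $\Phi(\cdot)$ denotes the
standard univariate normal cdf, $m = n_1 + n_2$, and $p = n_1/m$.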

Proof: Let $\bar X_{(i)}^{(1)}$ and $\bar X_{(i)}$ denote the first stage sample
mean and the cumulative (first stage plus second stage) sample
mean, respectively, from the population with mean $\mu_{[i]}$
$(1 \le i \le k)$. Then

$$P_\mu\{CS \mid \mathcal{P}_2\} \ge P_\mu\{\bar X_{(k)}^{(1)} \ge \bar X_{(i)}^{(1)} - h,\ \bar X_{(k)} \ge \bar X_{(i)}\ (1 \le i \le k-1)\}$$
$$= P\left\{U_i \le \frac{(\delta_{k,i}+h)\sqrt{n_1}}{\sigma\sqrt{2}},\ V_i \le \frac{\delta_{k,i}\sqrt{m}}{\sigma\sqrt{2}}\ (1 \le i \le k-1)\right\} \tag{3.2}$$

where $U_i = (\bar X_{(i)}^{(1)} - \bar X_{(k)}^{(1)} + \delta_{k,i})\sqrt{n_1}/(\sigma\sqrt{2})$ and
$V_i = (\bar X_{(i)} - \bar X_{(k)} + \delta_{k,i})\sqrt{m}/(\sigma\sqrt{2})$ $(1 \le i \le k-1)$. We note that the
$U_i$ and $V_i$ are standard normal r.v.'s with $\mathrm{Corr}\{U_i,U_j\} =
\mathrm{Corr}\{V_i,V_j\} = 1/2$, $\mathrm{Corr}\{U_i,V_i\} = \sqrt{p}$, and $\mathrm{Corr}\{U_i,V_j\} = \sqrt{p}/2$
$(i \ne j;\ 1 \le i,j \le k-1)$; thus using the representation (see
Bechhofer and Tamhane (1974)) $(U_i,V_i)' = (Y_{1i}+Y_{10},\ Y_{2i}+Y_{20})'/\sqrt{2}$
$(1 \le i \le k-1)$ where the $(Y_{1j},Y_{2j})'$ $(0 \le j \le k-1)$ are i.i.d.
bivariate normal with zero means, unit variances and correlation $\sqrt{p}$,
we obtain $A$ of (3.1a) which is our new lower bound.

The referee's lower bound $B$ is obtained by replacing all of the
correlations between the $U_i$ and the $V_j$ $(1 \le i,j \le k-1)$ by
zeros and applying Slepian's inequality. The inequality between
$B$ and our original lower bound $C$ was shown in Section 11 of T-B.

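We note that for $A$, $B$, and $C$ the LFC of the population
means, i.e., the configuration of the population means which
minimizes $A$, $B$, and $C$ subject to $\mu \in \Omega(\delta^*)$, is given by
$\mu_{[1]} = \cdots = \mu_{[k-1]} = \mu_{[k]} - \delta^*$ (which is also the conjectured LFC
for the exact $P_\mu\{CS \mid \mathcal{P}_2\}$ subject to $\mu \in \Omega(\delta^*)$). Thus we now
obtain

Corollary: For all $\mu \in \Omega(\delta^*)$ we have the following inequalities:

$$P_\mu\{CS \mid \mathcal{P}_2\} \ge A(\delta^*) \ge B(\delta^*) \ge C(\delta^*) \tag{3.3}$$

where

$$A(\delta^*) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \Phi_2^{k-1}\left[\frac{(\delta^*+h)\sqrt{n_1}}{\sigma} - x_1,\ \frac{\delta^*\sqrt{m}}{\sigma} - x_2 \,\Big|\, \sqrt{p}\right] d\Phi_2[x_1,x_2 \mid \sqrt{p}], \tag{3.4a}$$

$$B(\delta^*) = \left\{\int_{-\infty}^{\infty} \Phi^{k-1}[x + (\delta^*+h)\sqrt{n_1}/\sigma]\,d\Phi(x)\right\} \times \left\{\int_{-\infty}^{\infty} \Phi^{k-1}[x + \delta^*\sqrt{m}/\sigma]\,d\Phi(x)\right\}, \tag{3.4b}$$

$$C(\delta^*) = \int_{-\infty}^{\infty} \Phi^{k-1}[x + (\delta^*+h)\sqrt{n_1}/\sigma]\,d\Phi(x) + \int_{-\infty}^{\infty} \Phi^{k-1}[x + \delta^*\sqrt{m}/\sigma]\,d\Phi(x) - 1. \tag{3.4c}$$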
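The two simpler bounds $B(\delta^*)$ and $C(\delta^*)$ involve only single integrals, so they are easy to evaluate. The following sketch uses a plain Simpson rule truncated to $[-8, 8]$, which is an assumption made here for illustration and not the method used in the paper; the function names are hypothetical. Writing $a$ and $b$ for the two single integrals, $B - C = (1-a)(1-b) \ge 0$, which the code makes easy to check numerically.

```python
import math

def phi_cdf(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def single_integral(k, shift, lo=-8.0, hi=8.0, n=2000):
    """Simpson approximation of the integral of Phi^(k-1)(x + shift) dPhi(x)."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * step
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * phi_cdf(x + shift) ** (k - 1) * phi_pdf(x)
    return total * step / 3.0

def bounds_b_c(k, delta_star, sigma, n1, n2, h):
    """Return (B, C) of (3.4b) and (3.4c) at the least-favorable configuration."""
    m = n1 + n2
    a = single_integral(k, (delta_star + h) * math.sqrt(n1) / sigma)
    b = single_integral(k, delta_star * math.sqrt(m) / sigma)
    return a * b, a + b - 1.0
```

For example, `bounds_b_c(10, 2.0, 10.0, 107, 158, 1.299)` evaluates both bounds for the design constants used in the numerical example of Section 6.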


In the optimization problem (2.2) we shall replace the
constraint by $A(\delta^*) \ge P^*$; we denote by $(\hat n_1,\hat n_2,\hat h \mid C_1)$ the
corresponding conservative solution and by $\mathcal{P}_2(C_1)$ the associated
procedure. Similarly, conservative solutions and procedures
result if we employ the constraints $B(\delta^*) \ge P^*$ or $C(\delta^*) \ge P^*$
in (2.2), obtaining $(\hat n_1,\hat n_2,\hat h \mid C_2)$ and $\mathcal{P}_2(C_2)$, and $(\hat n_1,\hat n_2,\hat h \mid C_3)$
and $\mathcal{P}_2(C_3)$, respectively.


Remark 3.1: If we let $h \to \infty$ in $A$, $B$, or $C$ of (3.1a), (3.1b),
(3.1c), respectively, we obtain in each case
$\int_{-\infty}^{\infty} \prod_{i=1}^{k-1} \Phi[x + \delta_{k,i}\sqrt{m}/\sigma]\,d\Phi(x)$ which is an expression for $P_\mu\{CS \mid \mathcal{P}_1\}$
where $\mathcal{P}_1$ uses a common single-stage sample size $m$ per
population. Therefore $\mathcal{P}_1$ is a special case of any $\mathcal{P}_2$ based on either
(3.1a) or (3.1b) or (3.1c). Now it was shown in Bechhofer (1954)
that

$$\inf_{\mu \in \Omega(\delta^*)} P_\mu\{CS \mid \mathcal{P}_1\} = \int_{-\infty}^{\infty} \Phi^{k-1}[x + \delta^*\sqrt{m}/\sigma]\,d\Phi(x). \tag{3.5}$$

Thus if we let $\hat n$ denote the smallest value of $m$ for which the
r.h.s. of (3.5) is equal to or greater than $P^*$ (i.e., $\hat n$ is
the smallest single-stage sample size that guarantees (2.1)), then
for $i = 1,2,3$ we have

$$E_\mu\{T \mid \mathcal{P}_2(C_i)\} \le k\hat n \tag{3.6}$$

for all $\mu \in \Omega$.


3.2  U-minimax  optimization  problem 


Before  we  state  our  optimization  problem  we  cite  the  following 
results  given  by  Theorems  6.1  and  6.2,  respectively,  of  T-B: 

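(1) For any $\mu \in \Omega$ we have

$$E_\mu\{T \mid \mathcal{P}_2\} = kn_1 + n_2 \sum_{i=1}^{k} \int_{-\infty}^{\infty} \left\{\prod_{j \ne i} \Phi\left[x + \frac{(\delta_{i,j}+h)\sqrt{n_1}}{\sigma}\right] - \prod_{j \ne i} \Phi\left[x + \frac{(\delta_{i,j}-h)\sqrt{n_1}}{\sigma}\right]\right\} d\Phi(x), \tag{3.7}$$

and

$$\text{(2)} \quad \sup_{\mu \in \Omega} E_\mu\{T \mid \mathcal{P}_2\} = kn_1 + kn_2 \int_{-\infty}^{\infty} \{\Phi^{k-1}[x + h\sqrt{n_1}/\sigma] - \Phi^{k-1}[x - h\sqrt{n_1}/\sigma]\}\,d\Phi(x) \tag{3.8}$$

which occurs when $\mu_{[1]} = \mu_{[k]}$ (referred to as the equal means
configuration (EMC)).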

It is more convenient to work with continuous variables than
with discrete variables. Thus we define new design constants

$$c_1 = \frac{\delta^*\sqrt{n_1}}{\sigma}, \qquad c_2 = \frac{\delta^*\sqrt{n_2}}{\sigma}, \qquad d = \frac{h\sqrt{n_1}}{\sigma} \tag{3.9}$$

which we regard as nonnegative continuous variables. Then the
design constants $(\hat n_1,\hat n_2,\hat h \mid C_1)$ can be approximated by solving
the following continuous optimization problem:

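For given $k$ and specified $P^*$ choose $(c_1,c_2,d)$ to

$$\text{minimize } kc_1^2 + kc_2^2 \int_{-\infty}^{\infty} \{\Phi^{k-1}(x+d) - \Phi^{k-1}(x-d)\}\,d\Phi(x)$$

subject to

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \Phi_2^{k-1}\left(c_1+d-x_1,\ \sqrt{c_1^2+c_2^2}-x_2 \mid \sqrt{p}\right) d\Phi_2(x_1,x_2 \mid \sqrt{p}) \ge P^* \tag{3.10}$$

where $p = c_1^2/(c_1^2+c_2^2)$ and $c_1, c_2, d \ge 0$. We denote the solution
to (3.10) by $(\hat c_1,\hat c_2,\hat d \mid C_1)$ and for specified $\delta^*$ use the approximate
design constants

$$\hat n_1 = [(\hat c_1\sigma/\delta^*)^2], \qquad \hat n_2 = [(\hat c_2\sigma/\delta^*)^2], \qquad \hat h = \hat d\,\delta^*/\hat c_1 \tag{3.11}$$

where $[z]$ denotes the smallest integer $\ge z$, to implement
$\mathcal{P}_2(C_1)$.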
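The conversion (3.11) from the tabulated continuous constants to the integer design constants is a one-line calculation. The sketch below, with the hypothetical function name `design_constants`, reproduces it; the numbers in the usage note are the ones quoted in Section 6 of this paper ($k = 10$, $\sigma = 10$, $\delta^* = 2$, $P^* = 0.90$).

```python
import math

def design_constants(c1, c2, d, delta_star, sigma):
    """Apply (3.11): return (n1, n2, h) from the continuous solution (c1, c2, d)."""
    n1 = math.ceil((c1 * sigma / delta_star) ** 2)  # smallest integer >= (c1*sigma/delta*)^2
    n2 = math.ceil((c2 * sigma / delta_star) ** 2)
    h = d * delta_star / c1
    return n1, n2, h

n1, n2, h = design_constants(2.067, 2.507, 1.342, delta_star=2.0, sigma=10.0)
```

With these inputs the call gives first- and second-stage sample sizes of 107 and 158 per population and a screening constant of about 1.299, matching part a) of the numerical example.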


For $k = 2$ we note that (3.10) can be written as:

For specified $P^*$ choose $(c_1,c_2,d)$ to

minimize $2c_1^2 + 2c_2^2\{\Phi(d/\sqrt{2}) - \Phi(-d/\sqrt{2})\}$

subject to $\Phi_2[(c_1+d)/\sqrt{2},\ \sqrt{c_1^2+c_2^2}/\sqrt{2} \mid \sqrt{p}] \ge P^*$. (3.12)
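For $k = 2$ both the objective and the constraint of (3.12) can be evaluated directly. The sketch below is an illustration under stated assumptions: the bivariate normal cdf is computed by one-dimensional Simpson integration of $\phi(x)\,\Phi((b-\rho x)/\sqrt{1-\rho^2})$ over $x \le a$ (truncated at $-8$), which is a stand-in chosen for simplicity and not the Owen/Borth routine used in the paper. The function names and the trial point are hypothetical, and $c_2 > 0$ is assumed so that $\sqrt{p} < 1$.

```python
import math

def phi_cdf(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def bvn_cdf(a, b, rho, n=2000, lo=-8.0):
    """Phi_2(a, b | rho) for |rho| < 1, by Simpson's rule in the first argument."""
    hi = min(a, 8.0)
    if hi <= lo:
        return 0.0
    s = math.sqrt(1.0 - rho * rho)
    step = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * step
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * phi_pdf(x) * phi_cdf((b - rho * x) / s)
    return total * step / 3.0

def k2_objective(c1, c2, d):
    """Objective of (3.12)."""
    z = d / math.sqrt(2.0)
    return 2.0 * c1 ** 2 + 2.0 * c2 ** 2 * (phi_cdf(z) - phi_cdf(-z))

def k2_constraint(c1, c2, d):
    """Left-hand side of the constraint of (3.12); requires c2 > 0."""
    r = math.hypot(c1, c2)
    rho = c1 / r  # sqrt(p) with p = c1^2 / (c1^2 + c2^2)
    return bvn_cdf((c1 + d) / math.sqrt(2.0), r / math.sqrt(2.0), rho)
```

A point $(c_1,c_2,d)$ is feasible for (3.12) when `k2_constraint(c1, c2, d)` is at least $P^*$; an optimizer would then search the feasible set for the smallest `k2_objective`.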
4. CONSTANTS TO IMPLEMENT $\mathcal{P}_2(C_1)$ FOR $k \ge 3$

4.1 Tables of Constants to Implement $\mathcal{P}_2(C_1)$

Table I contains constants $(\hat c_1,\hat c_2,\hat d \mid C_1)$ necessary to
approximate $(\hat n_1,\hat n_2,\hat h \mid C_1)$ for $k = 2$ and selected $P^*$; these
constants are the solutions to (3.12). Also included in Table I
are the constants $(\hat c_1,\hat c_2,\hat d \mid E)$ and $(\hat c_1,\hat c_2,\hat d \mid C_3)$ necessary to
implement the exact procedure $\mathcal{P}_2(E)$ and $\mathcal{P}_2(C_3)$, respectively; these
were given in T-B and Tamhane (1975), respectively. We include
the latter constants here in order for the reader to see how the
constants $(\hat c_1,\hat c_2,\hat d \mid C_1)$ compare with them; however, we emphasize
that in practice one would only use $(\hat c_1,\hat c_2,\hat d \mid E)$ since these
constants are optimal.

Table II contains constants $(\hat c_1,\hat c_2,\hat d \mid C_1)$ necessary to
approximate $(\hat n_1,\hat n_2,\hat h \mid C_1)$ for $k = 3(1)10,12,15,25$ and selected
$P^*$; these constants are the solutions to (3.10). For fixed
(large) $P^*$ we have found that $\hat c_1$ and $\hat c_2$ are approximately
linear in $\log_e(k-1)$; similarly, for fixed $k$ and (large) $P^*$,
$\hat c_1^2$ and $\hat c_2^2$ are approximately linear in $\log_e\{P^*/(1-P^*)\}$. We
have not been able to characterize the behavior of $\hat d$ in a
simple way. The constants given in Table II for $k = 6(1)9$ and
12 were obtained by quadratic interpolation of $\hat c_1$, $\hat c_2$ and $\hat d$
against $\log_e(k-1)$ for fixed $P^* = 0.75$, 0.90, 0.95 and 0.99;
these constants (particularly $\hat d$) are not as accurate as the ones
which served as the basis for the interpolation.

4.2  Details  of  Computations  of  Constants 

All of the computations were carried out on Northwestern's
CDC 6600 computer in 32-bit arithmetic. The generalized reduced
gradient (GRG) algorithm of Abadie and Carpentier (1969) was used
to solve the constrained nonlinear optimization problems (3.10)
and (3.12). The constants $(\hat c_1,\hat c_2,\hat d \mid C)$ given in T-B were used
as initial guesses in the GRG algorithm; even with these
relatively "good" guesses, at least 10 and often more iterations were
required to arrive in the neighborhood of the absolute optimum.

The objective function is relatively flat in the region of the
absolute optimum which results in extremely slow convergence in
the later iterations. A maximum limit of 25 was placed on the
number of iterations, and the algorithm was terminated sooner
if no change was observed in the first four significant digits
of the objective function in five successive iterations. Thus
we would expect our solutions to be reasonably close to the
absolute optima.

TABLE II (continued)

 k     P*      c1      c2      d
12    0.99    3.243   3.231   1.349
      0.95    2.492   2.858   1.318
      0.90    2.120   2.630   1.256
      0.75    1.541   2.100   1.468
15    0.99    3.272   3.376   1.384
      0.95    2.532   3.001   1.346
      0.90    2.174   2.791   1.330
      0.75    1.588   2.350   1.364
      0.60    1.235   1.667   1.979
25    0.99    3.340   3.649   1.463
      0.95    2.621   3.302   1.411
      0.90    2.271   3.121   1.358
      0.75    1.704   2.809   1.256
      0.60    1.360   2.732   1.099
      0.45    1.014   2.227   1.219

Each iteration of the GRG algorithm corresponds to at least
one (and often many) evaluations of the double integral in the
constraint of (3.10) and its partial derivatives with respect to
$c_1$, $c_2$ and $d$. We evaluated these partials numerically by taking
the value of the double integral at the current solution
$(c_1,c_2,d)$ as the base value, say $\psi(c_1,c_2,d)$, and approximated
the derivatives as: $\partial\psi/\partial c_1 \approx \{\psi(c_1+\Delta,c_2,d) - \psi(c_1,c_2,d)\}/\Delta$,
etc., where $\Delta = 10^{-4}$. Since it was necessary to evaluate the
double integral in the constraint of (3.10) at least four and
often many more times in each iteration of the GRG algorithm, a
fast and accurate method was sought for this purpose.
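The forward-difference scheme described above can be sketched in a few lines. The helper name `forward_diff_grad` is hypothetical; the point of the sketch is that one base evaluation plus one shifted evaluation per variable gives the four evaluations of $\psi$ per gradient mentioned in the text.

```python
def forward_diff_grad(psi, point, step=1e-4):
    """Approximate the partial derivatives of psi at `point` by forward differences.

    `psi` takes the coordinates as separate arguments, e.g. psi(c1, c2, d).
    """
    base = psi(*point)  # evaluated once and reused for every partial
    grad = []
    for i in range(len(point)):
        shifted = list(point)
        shifted[i] += step
        grad.append((psi(*shifted) - base) / step)
    return grad
```

Because the truncation error of a forward difference is proportional to the step, the partials are only accurate to roughly the size of $\Delta$ times the local curvature, and any noise in the evaluation of $\psi$ is amplified by $1/\Delta$.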

We used Monte Carlo (MC) sampling to estimate the value of
this double integral by noting that it equals

$$E\{\Phi_2^{k-1}(c_1+d-X_1,\ \sqrt{c_1^2+c_2^2}-X_2 \mid \sqrt{p})\} \tag{4.1}$$

where $X_1$ and $X_2$ are standard normal r.v.'s with $\mathrm{Corr}\{X_1,X_2\} =
\sqrt{p} = c_1/\sqrt{c_1^2+c_2^2}$. Each estimate of (4.1) was based on an average
over 200 runs. In each run $(X_1,X_2)$ was generated by first
generating a pair $(Z_1,Z_2)$ of independent standard normal r.v.'s
using the Box-Muller algorithm and then employing the
transformation $X_1 = Z_1$, $X_2 = \sqrt{p}\,Z_1 + \sqrt{1-p}\,Z_2$. The Fortran library program
RANF was used to generate the uniform [0,1] r.v.'s needed as
inputs for the Box-Muller algorithm.
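A minimal sketch of this Monte Carlo estimate follows. It is written under several assumptions that differ from the original computations: Python's `random.Random` replaces RANF as the uniform generator, the bivariate normal cdf is obtained by one-dimensional Simpson integration instead of the Owen/Borth method, and all function names are hypothetical. It also assumes $c_2 > 0$ so that $\sqrt{p} < 1$.

```python
import math
import random

def phi_cdf(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def bvn_cdf(a, b, rho, n=200, lo=-8.0):
    """Phi_2(a, b | rho) for |rho| < 1, by Simpson's rule in the first argument."""
    hi = min(a, 8.0)
    if hi <= lo:
        return 0.0
    s = math.sqrt(1.0 - rho * rho)
    step = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * step
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * phi_pdf(x) * phi_cdf((b - rho * x) / s)
    return total * step / 3.0

def box_muller(rng):
    """Return two independent standard normal variates from two uniforms."""
    u1 = 1.0 - rng.random()  # in (0, 1], so the logarithm is finite
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    return radius * math.cos(2.0 * math.pi * u2), radius * math.sin(2.0 * math.pi * u2)

def mc_constraint(k, c1, c2, d, runs=200, seed=0):
    """Monte Carlo estimate of (4.1), the constraint integral of (3.10)."""
    rng = random.Random(seed)
    r = math.hypot(c1, c2)
    rho = c1 / r          # sqrt(p)
    p = rho * rho
    total = 0.0
    for _ in range(runs):
        z1, z2 = box_muller(rng)
        x1 = z1
        x2 = rho * z1 + math.sqrt(1.0 - p) * z2  # Corr(x1, x2) = sqrt(p)
        total += bvn_cdf(c1 + d - x1, r - x2, rho) ** (k - 1)
    return total / runs
```

For $k = 2$ the estimate can be compared with the closed form in (3.12), which gives a convenient check on both the sampler and the cdf routine.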

We evaluated $\Phi_2$ using Borth's (1973) modification of Owen's
(1958) method. This modification is based on the fact that
subject to a specified accuracy, Owen's method is fast only for a
certain range of values of the parameters of $\Phi_2$; for other
values of the parameters a computing method proposed by Borth is
faster. The modified method which is a composite of the two
methods is thus faster than Owen's method. For the Owen method
part we used the IMSL subroutine MDBNOR for which a limit on the
maximum error in evaluating $\Phi_2$ is specified to be $10^{-5}$ in the
IMSL manual. For the Borth method part the limit on the maximum
error in evaluating a certain T-function necessary to obtain $\Phi_2$
(see equation (2) of Borth (1973)) is specified to be $10^{-7}$ in his
article. In addition to the T-function it is also necessary to
compute some standard normal cdf values to obtain $\Phi_2$. (All of
these steps are carried out internally in MDBNOR.) We used the
approximation to $\Phi$ given by equation 26.2.17 of Abramowitz and
Stegun (1964) which is accurate to within $\pm 7.5 \times 10^{-8}$. Thus the
overall accuracy in the evaluation of $\Phi_2$ may be estimated to be
$\pm 10^{-5}$. (We mention that we tried to use Cadwell's (1951)
approximation to $\Phi_2$ but found its accuracy to be unacceptable
for our purposes.)

For evaluating the double integral we also tried using the
Gauss-Legendre quadrature method (with the integrals over $-\infty$
to $+\infty$ approximated by integrals over $-6$ to $+6$) with 16 nodes,
i.e., 256 integrand evaluations. This method requires
slightly more computer time than the MC method with 200 runs;
moreover, we found that in general the MC method gave a more accurate
estimate of the double integral. Hence we adopted the MC method.

For evaluating the single integral in the objective function
of (3.10) we used the Romberg quadrature method for which the
maximum error was controlled at $10^{-?}$; $\Phi$ appearing in the
integrand was evaluated using the approximation given in Abramowitz
and Stegun (1964) referred to earlier.
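The Romberg evaluation of the objective function can be sketched as follows. This is an illustrative sketch only: the truncation of the range to $[-8, 8]$, the tolerance of $10^{-9}$, the use of `math.erf` for $\Phi$ and the function names are assumptions made here, not details taken from the paper.

```python
import math

def phi_cdf(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def romberg(f, a, b, tol=1e-9, max_levels=20):
    """Romberg quadrature: trapezoid rule with repeated Richardson extrapolation."""
    h = b - a
    prev = [0.5 * h * (f(a) + f(b))]
    for i in range(1, max_levels):
        h *= 0.5
        mid_sum = sum(f(a + (2 * j - 1) * h) for j in range(1, 2 ** (i - 1) + 1))
        row = [0.5 * prev[0] + h * mid_sum]
        for m in range(1, i + 1):
            row.append(row[m - 1] + (row[m - 1] - prev[m - 1]) / (4 ** m - 1))
        # Require a few refinements before trusting agreement of the diagonal.
        if i >= 5 and abs(row[i] - prev[i - 1]) < tol:
            return row[i]
        prev = row
    return prev[-1]

def screening_integral(k, d):
    """Integral of {Phi^(k-1)(x+d) - Phi^(k-1)(x-d)} dPhi(x), truncated to [-8, 8]."""
    def integrand(x):
        return (phi_cdf(x + d) ** (k - 1) - phi_cdf(x - d) ** (k - 1)) * phi_pdf(x)
    return romberg(integrand, -8.0, 8.0)

def objective(k, c1, c2, d):
    """Objective function of (3.10)."""
    return k * c1 ** 2 + k * c2 ** 2 * screening_integral(k, d)
```

For $k = 2$ the integral has the closed form $\Phi(d/\sqrt{2}) - \Phi(-d/\sqrt{2})$ used in (3.12), which provides a check on the quadrature.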

The reported values of $\hat c_1$, $\hat c_2$, $\hat d$ are rounded off in the
fourth significant digit and are estimated to be correct to at
least the first three significant digits.

5. THE PERFORMANCE OF $\mathcal{P}_2(\cdot)$ RELATIVE TO $\mathcal{P}_1$

As a measure of the efficiency of $\mathcal{P}_1$ (Bechhofer (1954))
relative to that of $\mathcal{P}_2$ when both guarantee the same probability
requirement (2.1), we consider the ratio (termed relative
efficiency, $RE(\mathcal{P}_1:\mathcal{P}_2)$)

$$E_\mu\{T \mid \mathcal{P}_2\}/k\hat n, \tag{5.1}$$

where $\hat n = [(\hat c\sigma/\delta^*)^2]$, and $\hat c$ is the solution of

$$\int_{-\infty}^{\infty} \Phi^{k-1}(x+c)\,d\Phi(x) = P^*. \tag{5.2}$$

Clearly RE depends on $\mu$ and $\{\delta^*,P^*\}$; values of RE less than
one favor $\mathcal{P}_2$ over $\mathcal{P}_1$. To remove the dependence on $\delta^*$ we use
the continuous approximations to $E_\mu\{T \mid \mathcal{P}_2\}$ and $\hat n$ (thereby
ignoring the fact that the sample sizes must be integers). RE
is then given by

$$\left[k\hat c_1^2 + \hat c_2^2 \sum_{i=1}^{k} \int_{-\infty}^{\infty} \left\{\prod_{j \ne i} \Phi(x + \hat d + \delta_{i,j}\hat c_1/\delta^*) - \prod_{j \ne i} \Phi(x - \hat d + \delta_{i,j}\hat c_1/\delta^*)\right\} d\Phi(x)\right] \Big/ k\hat c^2 \tag{5.3}$$

where we employ in (5.3) the $(\hat c_1,\hat c_2,\hat d)$-values of the particular
procedure $\mathcal{P}_2$ being compared to $\mathcal{P}_1$. The value of $\hat c$ which is
the solution to (5.2) has been tabulated for selected $k$ and $P^*$
by Bechhofer (1954), Gupta (1963), and Milton (1963); Bechhofer's
$\lambda = \hat c$, Gupta's and Milton's $H = \hat c/\sqrt{2}$.
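When the published tables are not at hand, the solution of (5.2) can be approximated numerically. The sketch below is an illustration under stated assumptions, not the method behind the cited tables: it uses Simpson's rule truncated to $[-8, 8]$ and bisection on $[0, 10]$, and the function names are hypothetical.

```python
import math

def phi_cdf(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def pcs_single_stage(k, c, lo=-8.0, hi=8.0, n=1000):
    """Simpson approximation of the left-hand side of (5.2)."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * step
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * phi_cdf(x + c) ** (k - 1) * phi_pdf(x)
    return total * step / 3.0

def solve_c(k, p_star, tol=1e-7):
    """Bisection for the c solving (5.2); the integral is increasing in c."""
    low, high = 0.0, 10.0
    while high - low > tol:
        mid = 0.5 * (low + high)
        if pcs_single_stage(k, mid) < p_star:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

def single_stage_n(k, p_star, delta_star, sigma):
    """Return (c, n) with n the smallest integer >= (c * sigma / delta_star)^2."""
    c = solve_c(k, p_star)
    return c, math.ceil((c * sigma / delta_star) ** 2)
```

For $k = 10$, $P^* = 0.90$, $\sigma = 10$ and $\delta^* = 2$, the setting of the numerical example in Section 6, this should agree with the tabulated value 2.9829 to about three decimals and with the single-stage sample size 223 quoted there.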

Table III gives computed RE-values for $RE(\mathcal{P}_1:\mathcal{P}_2(E))$,
$RE(\mathcal{P}_1:\mathcal{P}_2(C_1))$, and $RE(\mathcal{P}_1:\mathcal{P}_2(C_3))$ for $k = 2$ and selected $P^*$,
while Table IV gives analogous values for $RE(\mathcal{P}_1:\mathcal{P}_2(C_1))$ for
$k \ge 3$. The computed values given in Tables III and IV were
obtained using the $(\hat c_1,\hat c_2,\hat d)$-values listed in Tables I and II,
respectively. Table V is an abbreviated one which permits
comparison of $\mathcal{P}_2(C_1)$, $\mathcal{P}_2(C_2)$ and $\mathcal{P}_2(C_3)$ with $\mathcal{P}_1$ via $RE(\mathcal{P}_1:\mathcal{P}_2(C_i))$
$(i = 1,2,3)$ for selected extreme $(k,P^*)$-combinations.

Since $\mathcal{P}_1$ is a special case of $\mathcal{P}_2(E)$ and $\mathcal{P}_2(C_i)$ with $d = \infty$
(see Remark 3.2), it follows that for $k = 2$ we have

$$1 \ge \max\{RE(\mathcal{P}_1:\mathcal{P}_2(E)),\ RE(\mathcal{P}_1:\mathcal{P}_2(C_i)),\ i = 1,2,3\}, \tag{5.4}$$

and for $k \ge 3$ we have

$$1 \ge \max\{RE(\mathcal{P}_1:\mathcal{P}_2(C_i)),\ i = 1,2,3\} \tag{5.5}$$

for all $\mu \in \Omega$. Thus, our two-stage procedures $\mathcal{P}_2(E)$ and
$\mathcal{P}_2(C_i)$ $(i = 1,2,3)$ are uniformly (in $\mu$) better than the
corresponding single-stage procedure $\mathcal{P}_1$ when all guarantee the same
probability requirement (2.1). Moreover, when the constraint in
(2.2) is replaced by $A(\delta^*) \ge P^*$, and then by $B(\delta^*) \ge P^*$, and
finally by $C(\delta^*) \ge P^*$, we have as a consequence of (3.3) that
the set of feasible values of $(c_1,c_2,d)$ decreases at each step.
Thus the corresponding minima of the objective function of (2.2)
increase at each step, and we have for $k \ge 3$:

$$RE_{EMC}(\mathcal{P}_1:\mathcal{P}_2(C_1)) \le RE_{EMC}(\mathcal{P}_1:\mathcal{P}_2(C_2)) \le RE_{EMC}(\mathcal{P}_1:\mathcal{P}_2(C_3)) \tag{5.6}$$

where EMC refers to the configuration $\mu_{[1]} = \mu_{[k]}$. This implies
that not only is $\mathcal{P}_2(C_1)$ U-minimax among our two-stage procedures
based on the conservative lower bound $A(\delta^*) \ge P^*$, but also that
it is U-minimax among our two-stage procedures based on the
conservative lower bounds $B(\delta^*) \ge P^*$ or $C(\delta^*) \ge P^*$ as well.
These findings and ones described in the sections below demonstrate
that $\mathcal{P}_2(C_1)$ is highly effective as a selection procedure with
screening.


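5.1 $\mathcal{P}_2(E)$, $\mathcal{P}_2(C_1)$ and $\mathcal{P}_2(C_3)$ vs. $\mathcal{P}_1$ for $k = 2$

We have discussed $RE(\mathcal{P}_1:\mathcal{P}_2(E))$ and $RE(\mathcal{P}_1:\mathcal{P}_2(C_3))$ in
Section 9.1 of T-B. From Table III we note that the performance
of $\mathcal{P}_2(C_1)$ is always "intermediate" to that of $\mathcal{P}_2(E)$ and
$\mathcal{P}_2(C_3)$ at $\delta = 0$. The range of $\delta > 0$ values for which $\mathcal{P}_2(C_1)$
is intermediate depends on $P^*$; thus, e.g., our computations
indicate that for $\delta = \infty$ we have $\mathcal{P}_2(C_1)$ as intermediate when
$P^* \ge 0.99$ but as poorest when $P^* \le 0.95$.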


5.2 $\mathcal{P}_2(C_1)$ vs. $\mathcal{P}_1$ for $k \ge 3$

The performance of $\mathcal{P}_2(C_1)$ relative to that of $\mathcal{P}_1$ can be
studied using the RE-values given in Table IV. We note that for
fixed $k$ and $P^*$, RE is a decreasing function of the differences
$\mu_{[k]} - \mu_{[i]}$ $(1 \le i \le k-1)$; thus $\mathcal{P}_2(C_1)$ capitalizes on
favorable configurations of the $\mu_i$ $(1 \le i \le k)$. The columns
headed EMC and $\mu_{[k]} - \mu_{[k-1]} = \infty$ represent the minimum
and maximum RE (measured in terms of $E_\mu\{T \mid \mathcal{P}\}$), respectively,
achieved when $\mathcal{P}_2(C_1)$ is used in place of $\mathcal{P}_1$; as noted earlier,
based on this criterion the experimenter always gains using
$\mathcal{P}_2(C_1)$.

We also note that for fixed $\mu$ and $P^*$, RE is a decreasing
function of $k$. Thus the screening feature of $\mathcal{P}_2(C_1)$ becomes
more effective as $k$ increases.

In the range of $P^*$-values for which computations were made
(these being the ones of greatest practical interest), we note
that for fixed $\mu$ and $k$, RE decreases and then increases as $P^*$
increases; in fact, $RE \to 1$ as $P^* \to 1/k$ or 1.

5.3 $\mathcal{P}_2(C_1)$, $\mathcal{P}_2(C_2)$, $\mathcal{P}_2(C_3)$ vs. $\mathcal{P}_1$ for $k \ge 3$

Table V, which gives $RE(\mathcal{P}_1:\mathcal{P}_2(C_i))$ $(i = 1,2,3)$ for four
"extreme" $(k,P^*)$-combinations, namely, $k = 3$ and 25 and
$P^* = 0.75$ and 0.99, can be used to compare the performances
of the $\mathcal{P}_2(C_i)$ $(i = 1,2,3)$ over a considerable portion of the range
of practical interest of these parameters. The values for
$RE(\mathcal{P}_1:\mathcal{P}_2(C_1))$ are taken from Table IV of the present paper, those
for $RE(\mathcal{P}_1:\mathcal{P}_2(C_2))$ from Table IV of T-B, and those for
$RE(\mathcal{P}_1:\mathcal{P}_2(C_3))$ were computed just for inclusion in this table.

We first note that for fixed $k$ and $P^*$ we have, over the
range of $(k,P^*)$-values considered, that

$$RE_\mu(\mathcal{P}_1:\mathcal{P}_2(C_1)) \le RE_\mu(\mathcal{P}_1:\mathcal{P}_2(C_2)) \le RE_\mu(\mathcal{P}_1:\mathcal{P}_2(C_3)) \tag{5.7}$$

for all $\mu \in \Omega$ (including $\{\mu_{[k]} - \mu_{[k-1]} = \infty\}$). This latter
is in contrast to the results for $k = 2$ where it was found that
$RE_\mu(\mathcal{P}_1:\mathcal{P}_2(C_1)) > RE_\mu(\mathcal{P}_1:\mathcal{P}_2(C_3))$ for $\mu_{[2]} - \mu_{[1]} \gg \delta^*$.

For fixed $\mu$ and $k$, and $P^* \to 1$, the RE-values are close
for the $\mathcal{P}_2(C_i)$ $(i = 1,2,3)$; this is so since $n_1 \to \infty$ and hence
the $\mathcal{P}_2(C_i) \to \mathcal{P}_1$; however, the effect on the RE-values of $P^* \to 1$
depends critically on $k$ for each of the $\mathcal{P}_2(C_i)$. For fixed $P^*$
"close to unity" we see that the RE-values of the $\mathcal{P}_2(C_i)$ are
closer for large $k$ than for small $k$.

For fixed $\mu$ and $k$, and $P^* \to 1/k$, our computational
results indicate that the minimax solution $(\hat n_1,\hat n_2,\hat h)$ of (3.9)
is such that $h \to \infty$ and hence $\mathcal{P}_2(C_i) \to \mathcal{P}_1$ $(i = 1,2,3)$; thus
the three procedures perform similarly. However, the value of $P^*$
(call it $P_i^*$) at which $h$ becomes large enough so that $\mathcal{P}_2(C_i)$
becomes "almost equivalent to" $\mathcal{P}_1$ is such that $P_1^* < P_2^* < P_3^*$.
Thus for moderate values of $P^*$ we find that $\mathcal{P}_2(C_1)$ is superior
(in terms of RE) to $\mathcal{P}_2(C_2)$ which in turn is superior to $\mathcal{P}_2(C_3)$.
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6 . NUMERICAL  EXAMPLE 

In this section we give a numerical example to illustrate
the use of Tables II and IV for $P_2(C_1)$. In addition, we compare
the performance (in terms of $E_\mu\{T \mid P\}$) of $P_2(C_1)$ with that of
the single-stage procedure $P_1$ and the open sequential procedure
$P_S(\mathrm{BKS})$ of Bechhofer, Kiefer and Sobel (1968, Section 12.6.1.1)
which samples a vector-at-a-time. The example will show in a
striking way the trade-offs that the experimenter has at his
disposal when choosing among procedures that guarantee (2.1).

Suppose that $k = 10$, $\sigma = 10$ and that the experimenter
specifies $\delta^* = 2$, $P^* = 0.90$; we then anticipate large sample
sizes since for $k = 10$ the specification $\delta^*/\sigma = 0.2$, $P^* = 0.90$
is a demanding one.

a) To determine the constants necessary to implement $P_2(C_1)$,
we obtain from Table II: $c_1 = 2.067$, $c_2 = 2.507$, $d = 1.342$.
Using (3.11) these yield $n_1 = [(2.067/0.2)^2] = [106.8] = 107$,
$n_2 = [(2.507/0.2)^2] = [157.1] = 158$, $h = 1.342(2)/2.067 = 1.299$.

b) To determine the constant necessary to implement $P_1$,
we obtain from Table I of Bechhofer (1954) that $c$ of our (5.2)
is 2.9829. Thus $n = [(2.9829(10)/2)^2] = [222.4] = 223$, which
is the number of observations required from each of the 10
populations.

c) We obtain estimates of $E_\mu\{T \mid P_S(\mathrm{BKS})\}$ from Tables 18.4.5
and 18.4.10 of BKS (1968) for the LFC and the EMC, respectively;
these are 1453 and 2906, respectively.
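The arithmetic in a) and b) can be reproduced with a short Python
sketch. It assumes that the bracket [x] denotes the smallest integer
not less than x, and that the sample sizes and the screening constant
take the forms n = [(c sigma/delta*)^2] and h = d delta*/c1 implied by
the worked numbers above; equation (3.11) itself is not restated
here, and the function names are illustrative only.

```python
import math


def two_stage_design(c1, c2, d, delta_star, sigma):
    """Return (n1, n2, h) using the forms inferred from step a).

    Assumption: [x] is read as the ceiling of x.
    """
    ratio = delta_star / sigma                 # delta*/sigma, 0.2 in the example
    n1 = math.ceil((c1 / ratio) ** 2)          # first-stage size per population
    n2 = math.ceil((c2 / ratio) ** 2)          # second-stage size per population
    h = d * delta_star / c1                    # screening constant
    return n1, n2, h


def single_stage_size(c, delta_star, sigma):
    """Per-population sample size for the single-stage procedure, as in step b)."""
    return math.ceil((c * sigma / delta_star) ** 2)


k = 10
n1, n2, h = two_stage_design(2.067, 2.507, 1.342, delta_star=2.0, sigma=10.0)
n = single_stage_size(2.9829, delta_star=2.0, sigma=10.0)
total_single = k * n                           # total observations for P1

print(n1, n2, round(h, 3), n, total_single)
```

The constants passed in are the tabled values quoted in the text; any
other (k, P*) combination would require the corresponding table entries.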

The above results are summarized in Table VI.


TABLE VI

$E_\mu\{T \mid P\}$ for $P_1$, $P_2(C_1)$, and $P_S(\mathrm{BKS})$
for Selected Configurations of the Population Means
when $k = 10$, $\delta^*/\sigma = 0.2$, $P^* = 0.90$

                              $E_\mu\{T \mid P\}$
Procedure              EMC      LFC      $\mu_{[10]} - \mu_{[9]} = \infty$

$P_1$                  2230     2230     2230
$P_2(C_1)$             1775     1510     1070
$P_S(\mathrm{BKS})$    2906     1453       10

Note: The entries for $P_2(C_1)$ in Table VI were computed by
multiplying the relevant relative efficiencies in Table IV by 2230.


The entries in Table VI illustrate the savings in $E_\mu\{T \mid P\}$
when $P_2(C_1)$ is used in place of $P_1$. In addition, the entries
for $P_S(\mathrm{BKS})$ show the dramatic further savings that can be
achieved using that procedure if sequential sampling is a viable
method of experimentation for the practical problem at hand and
it is anticipated that the largest population means are not too
close. However, it must be emphasized here that we are presently
focusing on $E\{T\}$. The distribution of $T$ has a very large
standard deviation for $P_S(\mathrm{BKS})$ when $E\{T\}$ is large (see Tables
18.4.5 and 18.4.10 of BKS (1968) where for $P^* = 0.90$ the
estimated standard deviations of $T$ are given as 713.4 and
1744.6 for the LFC and EMC, respectively) and is highly skewed
to the right; thus if the experimenter uses this procedure he
must be prepared to accept occasional very large values of $T$.

The closed sequential procedure of Paulson (1964) (with the
improvement of Fabian (1974)) which samples a vector-at-a-time
and eliminates populations is superior to $P_S(\mathrm{BKS})$ in terms of
$E_\mu\{T \mid P\}$ over certain ranges of values of $k$ and $P^*$; for the
problem at hand with $k = 10$, $P^* = 0.90$, $\delta^*/\sigma = 0.2$ and
Paulson's design parameter $\lambda = \delta^*/2$ we have, using Fabian's
improvement, that the maximum number of stages to terminate
experimentation for Paulson's procedure is 381, and hence an
upper bound on $E_\mu\{T \mid P\}$ which is extremely conservative
(since it does not take into account the fact that populations are
permanently eliminated prior to termination of experimentation)
is 3810. Here again the problem at hand must be such that a
completely sequential procedure is feasible, and the experimenter
must be prepared to accept very large values of $T$.
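As a rough check on the figures quoted in this section, the following
Python sketch recomputes the conservative upper bound for Paulson's
procedure and expresses the sequential totals relative to the
single-stage total. The bound is taken here to be k times the maximum
number of stages, i.e. it assumes that all k populations are sampled
at every stage; the variable names are illustrative only.

```python
k = 10                       # number of populations
n_single = 223               # per-population size for the single-stage procedure
max_stages_paulson = 381     # maximum number of stages quoted in the text

total_single = k * n_single                  # total for the single-stage procedure
paulson_bound = k * max_stages_paulson       # no elimination credited

# Estimated expected totals for the open sequential procedure (from the text).
bks_expected_total = {"LFC": 1453, "EMC": 2906}

# Ratios relative to the single-stage total; values below 1 indicate savings.
ratios = {name: value / total_single for name, value in bks_expected_total.items()}
ratios["Paulson upper bound"] = paulson_bound / total_single

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f}")
```

Because eliminated populations are no longer sampled, the actual
expected total for Paulson's procedure lies below this bound.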

7. CONCLUDING REMARKS

We feel that in spite of the relatively heavy financial costs
involved in obtaining the design constants necessary to implement
$P_2(C_1)$, they were justified by the final results. For we have
been able to demonstrate conclusively that $P_2(C_1)$ represents
a significant improvement over both $P_2(C_2)$ and $P_2(C_3)$ (as
well as over $P_1$). And we now can offer a highly effective
two-stage procedure, incorporating screening, which is easy to
implement.


8.  DIRECTIONS  OF  FUTURE  RESEARCH 


In Section 10 of T-B we postulated several unsolved problems
associated with $P_2$. All of these still remain open problems.
The most important of these (at least from a theoretical point of
view) is that of determining the LFC of the $\mu_i$ for $k > 2$. If
the conjectured LFC $\mu_{[1]} = \cdots = \mu_{[k-1]} = \mu_{[k]} - \delta^*$ can be proved
to be the true one, and if an efficient algorithm can be found
for evaluating the exact $P_\mu\{CS \mid P_2\}$ for $\mu$ in the LFC, then
it would be of considerable interest to determine how much decrease
in $\sup_{\mu \in \Omega} E_\mu\{T \mid P\}$ can be achieved if $P_2(E)$ is used in place of
$P_2(C_1)$ for $k \ge 3$.